100 research outputs found

    Gibbs sampling will fail in outlier problems with strong masking

    This paper discusses the convergence of the Gibbs sampling algorithm when it is applied to outlier detection in regression models. In theory, the algorithm converges to the true posterior distribution from any vector of initial conditions. In practice, convergence can be slow in a high-dimensional parameter space where the parameters are highly correlated. We show that the effect of leverage in regression models makes convergence of the Gibbs sampling algorithm very difficult in data sets with strong masking. The problem is illustrated with several examples.

    Selection of variables for cluster analysis and classification rules

    In this paper we introduce two procedures for variable selection in cluster analysis and classification rules. One is mainly oriented toward detecting noisy, non-informative variables, while the other also deals with multicollinearity. A forward-backward algorithm is also proposed to make these procedures feasible for large data sets. A small simulation is performed and some real data examples are analyzed. Comment: 28 pages, 7 figures

    A multivariate Kolmogorov-Smirnov test of goodness of fit

    This paper presents a distribution-free multivariate Kolmogorov-Smirnov goodness-of-fit test. The test uses a statistic built from Rosenblatt's transformation, and an algorithm is developed to compute it in the bivariate case. An approximate test that can be easily computed in any dimension is also presented. The power of these multivariate tests is studied in a simulation study.
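As a rough illustration of the idea in the abstract above, the sketch below applies Rosenblatt's transformation under an assumed null model (a standard bivariate normal with correlation `rho`) and then measures how far the transformed points are from the uniform distribution on the unit square. The null model, the function names, and the shortcut of evaluating the gap only at the sample points are assumptions made for this example; the shortcut yields a lower bound on the supremum and is not the exact algorithm the paper develops.

```python
import math
import random


def norm_cdf(z):
    """Standard normal CDF, written with the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def rosenblatt_bivariate_normal(points, rho):
    """Rosenblatt transformation for a standard bivariate normal (assumed null).

    u1 = F(x1) and u2 = F(x2 | x1). If the null model is true, the pairs
    (u1, u2) are independent and uniform on the unit square.
    """
    scale = math.sqrt(1.0 - rho * rho)
    return [(norm_cdf(x1), norm_cdf((x2 - rho * x1) / scale)) for x1, x2 in points]


def ks_distance_at_sample_points(uniforms):
    """Largest |F_n(a, b) - a*b| over the sample points only.

    This is a lower bound on the true supremum, used here for simplicity.
    """
    n = len(uniforms)
    worst = 0.0
    for a, b in uniforms:
        count = sum(1 for u, v in uniforms if u <= a and v <= b)
        worst = max(worst, abs(count / n - a * b))
    return worst


# Small demonstration on simulated data drawn from the assumed null model.
random.seed(0)
rho = 0.6
sample = []
for _ in range(200):
    z1, z2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    sample.append((z1, rho * z1 + math.sqrt(1.0 - rho * rho) * z2))

stat = ks_distance_at_sample_points(rosenblatt_bivariate_normal(sample, rho))
print(f"approximate statistic: {stat:.3f}")
```

In practice the statistic would be compared against critical values, for example ones obtained by simulating uniform samples of the same size.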

    Bayesian unmasking in linear models

    We propose a Bayesian procedure for multiple outlier detection in linear models that avoids the masking problem. The proposal is illustrated with several examples in which it outperforms other recent methods for multiple outlier detection. The posterior probability that each data point is an outlier is estimated with a new adaptive Gibbs sampling method, which modifies the initial conditions of the Gibbs sampler using the eigenstructure of the covariance matrix of the indicator variables. This procedure also overcomes the false convergence of the Gibbs sampler in problems with strong masking.

    Forecasting with missing data: Application to a real case

    This paper presents a comparative analysis of linear and mixed models for short-term forecasting of a real data series with a high percentage of missing data. The data are the series of significant wave heights recorded every three hours by a buoy in the Bay of Biscay. The series is interpolated with a linear predictor that minimizes the forecast mean square error. The linear models are seasonal ARIMA models, and the mixed models have a linear component and a nonlinear seasonal component. The nonlinear component is estimated by a nonparametric regression of the data on time. Short-term forecasts, no more than two days ahead, are of interest because the port authorities can use them to notify the fleet. Several models are fitted and compared by their forecasting behavior. Keywords: significant wave height, mean square error, linear interpolation, ARIMA models, nonparametric smoothing

    A study of a decade of childhood poisonings in a tertiary hospital

    A retrospective, observational, descriptive study was conducted of poisonings in minors treated in the emergency department of the Hospital Universitario Río Hortega in Valladolid from 2001 to 2010. A total of 994 cases were collected, and the epidemiological, toxicological, clinical and therapeutic characteristics of the poisonings were determined, along with their evolution over the decade. Results: poisonings accounted for 0.41% of emergency visits by patients under 18 years of age; 53.72% of the patients were male, and the mean age was 10.15 ± 6.3 years. The main diagnosis was alcohol intoxication (43.66%), followed by drug poisoning (26.86%), household products (8.14%) and carbon monoxide (8.04%). Overall, 40.4% arrived by ambulance, 58.9% received some form of treatment, 8.9% required consultation with the Mental Health Unit and 8.65% with the Toxicology Service. Departamento de Medicina, Dermatología y Toxicología

    Adaptive Gibbs sampling algorithms for identifying heterogeneity in regression and time series

    The main objective of this doctoral thesis is to develop new procedures for identifying outlying observations that introduce heterogeneity into samples of independent and dependent data. Two different algorithms, both based on Gibbs sampling, are proposed for the regression and time series problems. As with classical outlier identification methods, it is shown that the standard application of Gibbs sampling does not correctly identify outliers in problems with groups of masked outlying observations. Given any vector of initial values, the algorithm theoretically converges to the true posterior distribution of the parameters; however, convergence can be extremely slow when the parameter space is high dimensional and the parameters are highly correlated. The new algorithms discussed in this work use a learning process to adapt the initial conditions of the Gibbs sampler and improve its convergence to the posterior distribution of the model parameters.

    Proposed working methodology for producing cartography from LiDAR data recorded in a rural area.

    Since the arrival of LiDAR measurement, mapping has become easier, with digital models obtained very quickly and accurately. However, handling the large amount of recorded information requires a set of algorithms that can extract the important and necessary details of the recorded area. This work therefore presents a methodology for mapping a rural area at 1/1000 scale, based on contour maps and orthophotos generated from the DTM and DSM of the area. All tests were performed using the MDTopX software.

    Local meteorological conditions, shape and desiccation influence dispersal capabilities for airborne microorganisms

    The atmosphere plays an important role in the dispersal of microorganisms and in the connectivity of most of the planet's ecosystems. In recent decades, interest in microbial diversity and dispersion in the atmosphere has increased because of its importance in various fields. However, there are few studies on the abundance of airborne microorganisms and the factors, such as meteorology, that affect their distribution. Likewise, the physical-mathematical models that attempt to reproduce their possible origins need to integrate some biological features. We collected airborne microorganisms under different meteorological conditions at a sampling station over a 12-day period to expand knowledge of the abundance of airborne microorganisms, their relationship with atmospheric conditions, and their possible origins from a biological perspective. Total abundance and size distribution of microorganisms were measured in all samples using epifluorescence techniques. Their possible origins were estimated using refined mathematical simulation models of air-mass back-trajectories that consider dry deposition. Our results showed microbial abundance values similar to those found in temperate regions over land. We report a clear relationship between abundance and local meteorological conditions considered as a whole. Although most of the captured particles were small spherical microorganisms (diameter < 20 μm), large filamentous microorganisms, surprisingly up to 400 μm, were also found. We demonstrate that these large microorganisms may originate at long distances, indicating the possibility of remarkably long-range dispersal without ruling out a nearby origin, when their equivalent spherical diameter (ESD) and drying capacity are considered. This work was supported by the Spanish Agencia Estatal de Investigación (AEI) and Fondo Europeo de Desarrollo Regional (FEDER), Grant CTM2016-79741-R. SGal was supported by a Fomento de la Investigación-aid fellowship Master Studies-UAM 2019 from Universidad Autónoma de Madrid
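To make the role of the equivalent spherical diameter in the abstract above more concrete, the worked example below assumes the common volume-equivalent definition (the diameter of a sphere with the same volume as the particle) and treats a filament as a cylinder. The filament dimensions are hypothetical and chosen only for illustration; they are not taken from the study.

```latex
% Volume-equivalent ESD, and its value for a cylinder of radius r and length L
\[
  d_{\mathrm{ESD}} = \left(\frac{6V}{\pi}\right)^{1/3},
  \qquad
  V_{\mathrm{cyl}} = \pi r^{2} L
  \;\Rightarrow\;
  d_{\mathrm{ESD}} = \left(6\, r^{2} L\right)^{1/3}.
\]
% Hypothetical filament: r = 1 um, L = 400 um
\[
  d_{\mathrm{ESD}} = \left(6 \cdot 1^{2} \cdot 400\right)^{1/3}\,\mu\mathrm{m}
                   = 2400^{1/3}\,\mu\mathrm{m}
                   \approx 13.4\,\mu\mathrm{m}.
\]
```

Under these assumptions, a thin filament 400 μm long has the volume of a sphere only about 13 μm across, which is in the size range of the small spherical particles mentioned in the abstract.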